This subsystem handles job quality control (QC) and instance termination. It has two QC mechanisms: LambdaQCChecker, which runs in the AWS Lambda environment for pre-execution or post-processing checks, and QCChecker, which is integrated into the main job runner for in-execution validation. It also includes TerminationHandler, which detects AWS EC2 spot instance termination notices so that the instance can shut down gracefully and preserve data.
Components
QCChecker
Performs quality control checks on the executed job within the runner environment. It runs all defined QC checks by evaluating expressions against QC result files and can abort the Step Functions workflow execution if failures are detected.
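The expression-based check described above can be sketched as follows. This is a minimal illustration, not the component's actual code: it assumes each QC result file is a flat JSON object of metric names to values, and the names `evaluate`, `run_checks`, and the `abort` callback are hypothetical. The `abort` callback stands in for whatever stops the Step Functions execution.

```python
import ast
import json
import operator

# Comparison operators the sketch allows inside a QC expression.
_COMPARE = {
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
}


def evaluate(expression, values):
    """Evaluate a boolean expression such as 'mapped_reads >= 1000'.

    Names in the expression are looked up in `values` (assumed format).
    Only constants, names, comparisons, and/or/not, and unary minus are
    accepted, so arbitrary code in an expression is rejected.
    """
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return values[node.id]  # KeyError if the metric is missing
        if isinstance(node, ast.BoolOp):
            results = [walk(v) for v in node.values]
            return all(results) if isinstance(node.op, ast.And) else any(results)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not walk(node.operand)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if isinstance(node, ast.Compare):
            left = walk(node.left)
            # Handles chained comparisons such as 0 < x <= 10.
            for op, comparator in zip(node.ops, node.comparators):
                compare = _COMPARE.get(type(op))
                if compare is None:
                    raise ValueError("unsupported operator: " + type(op).__name__)
                right = walk(comparator)
                if not compare(left, right):
                    return False
                left = right
            return True
        raise ValueError("unsupported expression element: " + type(node).__name__)

    return bool(walk(ast.parse(expression, mode="eval")))


def run_checks(checks, abort):
    """Run every (qc_file_path, expression) pair; call abort if any fail.

    `abort` is a hypothetical callback that receives the list of failures.
    """
    failures = []
    for path, expression in checks:
        with open(path) as handle:
            values = json.load(handle)  # assumed: QC results stored as JSON
        if not evaluate(expression, values):
            failures.append((path, expression))
    if failures:
        abort(failures)
    return failures
```

Running every check before aborting, rather than stopping at the first failure, lets the abort message list all failed checks at once. Whether the real component does this is not stated in this section.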
TerminationHandler
Handles graceful termination of instances, spot instances in particular. It periodically polls the EC2 metadata service for termination notices and logs a warning when one appears, so that a graceful shutdown can begin.
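A polling loop of this kind might look like the sketch below. It is an illustration under assumptions, not the handler's actual implementation: the function names, the five-second interval, and the use of IMDSv2 session tokens are choices made for this example. The `spot/instance-action` metadata path returns HTTP 404 until EC2 has scheduled an interruption.

```python
import json
import logging
import threading
import urllib.error
import urllib.request

METADATA = "http://169.254.169.254/latest"
logger = logging.getLogger("termination_handler")


def fetch_spot_action(timeout=2.0):
    """Return the parsed spot instance-action notice, or None if absent."""
    # IMDSv2: request a short-lived session token before reading metadata.
    token_request = urllib.request.Request(
        METADATA + "/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    try:
        with urllib.request.urlopen(token_request, timeout=timeout) as resp:
            token = resp.read().decode()
        request = urllib.request.Request(
            METADATA + "/meta-data/spot/instance-action",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return json.loads(resp.read())
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # no termination notice has been issued
        raise
    except OSError:
        return None  # metadata service unreachable (for example, not on EC2)


def watch(fetch=fetch_spot_action, interval=5.0, stop=None, max_polls=None):
    """Poll until a notice appears, `stop` is set, or max_polls is reached."""
    stop = stop or threading.Event()
    polls = 0
    while not stop.is_set():
        notice = fetch()
        if notice is not None:
            logger.warning(
                "Spot termination notice: action=%s time=%s",
                notice.get("action"), notice.get("time"),
            )
            return notice
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return None
        stop.wait(interval)  # sleeps, but wakes early if stop is set
    return None
```

In a job runner, `watch` would typically run in a daemon thread (`threading.Thread(target=watch, daemon=True).start()`) so that polling does not block the job itself.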
LambdaQCChecker
A dedicated quality control component for the AWS Lambda environment. It reads QC results from an S3 path and evaluates a specified expression. If the check fails, it stops the associated Step Functions execution and raises a QCFailed exception.
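The read, evaluate, stop, and raise sequence can be sketched as below. This is a hedged illustration: the function name `check_qc`, its parameters, and the JSON format of the QC results are assumptions, and the `passes` callable stands in for the expression evaluation. The `s3` and `sfn` arguments are expected to behave like boto3 clients; `get_object` and `stop_execution` are real boto3 calls, used here with their documented keyword arguments.

```python
import json


class QCFailed(Exception):
    """Raised when the QC check does not pass."""


def check_qc(s3, sfn, bucket, key, execution_arn, passes):
    """Read QC results from S3 and stop the execution if the check fails.

    s3, sfn: objects with the boto3 S3 / Step Functions client interface.
    passes:  hypothetical callable taking the parsed results and returning
             True when the QC expression holds.
    """
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    results = json.loads(body)  # assumed: QC results stored as JSON

    if passes(results):
        return results

    cause = "QC check failed for s3://%s/%s" % (bucket, key)
    # Stop the workflow first so later states do not run on bad data,
    # then surface the failure to the Lambda caller.
    sfn.stop_execution(executionArn=execution_arn, error="QCFailed", cause=cause)
    raise QCFailed(cause)
```

Passing the clients in as arguments keeps the function testable without AWS access. In a Lambda handler they would be created with `boto3.client("s3")` and `boto3.client("stepfunctions")`.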